1 Introduction

The stylistic properties of translations have received some attention in
computational stylistics. Rybicki (2006) used Burrows's Delta method to
distinguish character idiolects in the work of the Polish author Henryk
Sienkiewicz and in two corresponding English translations. That study concluded
that idiolectal differences could be observed in the source texts and that this
variation was preserved to a large degree in both translations. It also found
that the two translations were highly distinguishable from one another.

Burrows (2002) examined English translations of Juvenal, also using the Delta
method. The results suggest that some translators are more adept at concealing
their own style when translating the works of another author, whereas others
tend to imprint their own style to a greater extent on the work they translate.

Our work examines the writing of a single author, the Norwegian playwright
Henrik Ibsen, together with translations of these writings from Norwegian into
German and English, in an attempt to investigate the preservation of
characterization, defined here as the distinctiveness of the textual
contributions of characters.
2 Background

Many studies in computational stylistics have focused on tasks which are
related to authorship attribution but are not concerned with attributing
authorship to texts of unknown provenance. One such area of study is pastiche,
an intended imitation of an author's style in the same language, which
contrasts with translation, an intended imitation of an author's style in a
different language. Somers & Tweedie (2003) conducted experiments involving
pastiche. The author in question was Lewis Carroll and the pastiche was Alice
through the Needle's Eye, a modern children's fable by Gilbert Adair in which
the author attempted to imitate the style of Carroll in such works as Through
the Looking Glass and Alice's Adventures in Wonderland. Various techniques from
authorship attribution were applied to the task, including measures of lexical
richness, principal component analysis, the cusum technique (note 1), and
others. Some methods distinguished the pastiche from the original and some did
not. Somers & Tweedie (2003) conclude with a question: if a pastiche is
indistinguishable from the original by an authorship attribution method, can it
be said that the pastiche is in fact a perfect imitation of the original, or is
it the case that the method is flawed? In the case of translation, which is of
relevance to our current work, the question can be formulated in a different
way: if a translation is highly similar stylistically to other works by the
same translator, is the translation a faithful one?

The current study builds on previous work detecting character voices in the
poetry of the Irish poet Brendan Kennelly by Vogel & Brisset (2007) and on a
study of characterization in the work of playwrights by Vogel & Lynch (2008).
These studies were concerned with the language used by authors in the creation
of character. The tools used in the present study were also used in those
studies.


3 Experimental Setup 

For these experiments, three works by Henrik Ibsen were used: A Doll's House
(1879), Ghosts (1881), and The Master Builder (1892) (note 2). The electronic
versions of these plays were obtained from ibsen.net (note 3) and Project
Gutenberg. The contributions of each character were extracted using PlayParser
(note 4). All stage instructions are discarded in this step, leaving only the
character dialogue.

The method decomposes all texts associated with a category (here, persona or
play) into chunks of equal size. Pairwise similarity scores are computed for
all chunks. The metric is the average chi-square computation of the difference
in distribution between a pair of files, taken over each token appearing in
either file. Different sorts of tokenization capture different linguistic
features whose distributions one might consider within and across text
categories. If the pairwise similarity scores are rank ordered, then one can
exploit the intuitions that a homogeneous category will have a smaller rank-sum
than a heterogeneous one, and that arbitrary samples from a homogeneous
category should be more like the rest of that category than like alternative
categories. The method also provides a way to measure the degree of
homogeneity: the number of samples which are more like the rest of their
category can be measured against a baseline created by random sampling. See
Vogel & Lynch (2008) for a more detailed account of the method.
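
The metric and the rank ordering described above can be illustrated with a
short sketch. This is not the tool used in the study: the function names, the
letter tokenization, and the way expected counts are derived (pooled counts
split in proportion to file size) are assumptions made for illustration, and
tied scores are not given averaged ranks.

```python
from collections import Counter
from itertools import combinations


def letter_tokens(text):
    """One possible tokenization: lower-cased alphabetic characters."""
    return [ch for ch in text.lower() if ch.isalpha()]


def avg_chi_square(tokens_a, tokens_b):
    """Average per-token chi-square between two token sequences.

    Assumption: expected counts are the pooled counts split in proportion
    to the sizes of the two files. Lower scores mean more similar files.
    """
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    n_a, n_b = sum(counts_a.values()), sum(counts_b.values())
    if n_a == 0 or n_b == 0:
        raise ValueError("both files must contain at least one token")
    vocabulary = set(counts_a) | set(counts_b)
    total = 0.0
    for token in vocabulary:
        obs_a, obs_b = counts_a[token], counts_b[token]
        exp_a = (obs_a + obs_b) * n_a / (n_a + n_b)
        exp_b = (obs_a + obs_b) * n_b / (n_a + n_b)
        total += (obs_a - exp_a) ** 2 / exp_a + (obs_b - exp_b) ** 2 / exp_b
    return total / len(vocabulary)


def within_category_mean_rank(chunks, category):
    """Mean rank of the pairs drawn entirely from `category`.

    `chunks` is a list of (category, tokens) pairs. All pairwise scores
    are rank ordered (rank 1 = most similar pair); a homogeneous category
    should receive a low mean rank for its internal pairs.
    """
    scored = sorted(
        (avg_chi_square(a, b), cat_a == cat_b == category)
        for (cat_a, a), (cat_b, b) in combinations(chunks, 2)
    )
    ranks = [rank for rank, (_, within) in enumerate(scored, start=1) if within]
    if not ranks:
        raise ValueError("category needs at least two chunks")
    return sum(ranks) / len(ranks)
```

In this sketch a category whose internal pairs rank well below the midpoint of
all ranks would count as comparatively homogeneous. The comparison against a
baseline created by random sampling is not shown.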

4 Experiments 

4.1 First Experiment 

The first experiment seeks to compare character homogeneity over different
languages. The second experiment compares two different translations of the
same play in order to quantify similarity between parallel translations.
Table 1 shows the plays and their respective translators.

Play                                      Language   Translator
Gespenster (Ghosts)                       German     Sigurd Ibsen
Ein Puppenhaus (A Doll's House)           German     Marie von Borch
Baumeister Solness (The Master Builder)   German     Marie von Borch
The Master Builder                        English    William Archer & Edmund Gosse
A Doll's House                            English    William Archer
Ghosts                                    English    William Archer
Ghosts                                    English    R. Farquharson Sharp

Table 1: Plays and Translators

The first 10k of text per character was examined and this was split into 5
sections. Thus, the criterion for inclusion in the study was that a character
should have at least 10k of text; 11 characters were examined, as detailed in
Table 2. Only the version of Ghosts translated by Archer is used in the first
experiment. The results reported below are statistically significant.
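
The sampling step can be sketched as follows. The text does not say whether
"10k" refers to characters, bytes, or words; the sketch assumes characters of
dialogue, and the function name and parameters are hypothetical.

```python
def character_chunks(dialogue, limit=10_000, parts=5):
    """Split the first `limit` characters of a persona's dialogue into
    `parts` equal sections.

    Returns None when the persona has less than `limit` characters of
    text, mirroring the inclusion criterion. Treating "10k" as a count
    of characters is an assumption.
    """
    if len(dialogue) < limit:
        return None
    sample = dialogue[:limit]
    size = limit // parts
    return [sample[i * size:(i + 1) * size] for i in range(parts)]
```

Under these assumptions each included persona contributes five chunks of 2,000
characters, and any text beyond the first 10,000 characters is ignored.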

Character        Play
Engstrand        Ghosts
Pastor Manders   Ghosts
Oswald           Ghosts
Mrs Alving       Ghosts
Helmer           A Doll's House
Krogstad         A Doll's House
Mrs Linde        A Doll's House
Nora             A Doll's House
Aline            The Master Builder
Hilde            The Master Builder
Solness          The Master Builder

Table 2: Characters and their plays

The results for the first experiment showed that character homogeneity varies
to some extent over the translations: the character idiolects are not
necessarily preserved to the same degree as in the originals. When letter
frequencies are measured, the characters in the Norwegian original prove to be
more homogeneous than those in the translations. One example is the character
of Engstrand, who is homogeneous in English and Norwegian but not in German.
However, one character whose language remains distinct across all of the
translations is Nora, the heroine of A Doll's House and one of the typical
strong female characters found in Ibsen's drama (note 5). When the play is
taken as the category, we find that the chunks of personas from each play are
more similar to those of personas from the same play than to those from
different plays, and this is consistent across languages. So while
within-character homogeneity is not always preserved, the homogeneity of the
plays remains relatively consistent across languages.


5 The Second Experiment 

The second experiment sought to examine whether two translations of the same
original text into the same language are distinguishable by translator, as in
the work by Rybicki (2006), while observing the preservation of idiolect in
each. The experimental setup was similar to that of the first experiment, with
the character contributions separated and split into five files each. This
time, however, the characters from the two translations of Ghosts by William
Archer and Robert Farquharson Sharp were compared with each other.

Our findings were that the characters from Archer's translation were more
homogeneous in general than those of Sharp's translation. For the characters
which were not homogeneous, the text segments were more similar to the segments
of the same character by the other translator than to any other writings by the
same translator. Sharp's characters were similar to the corresponding Archer
character more often than vice versa. This suggests that both translators have
managed to perform faithful translations which are not highly influenced by
their own writing style. It also suggests that Sharp may have used Archer's
translation as a reference when crafting his own (note 6).
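
One simple way to picture this comparison is to ask, for each text segment,
whether its most similar other segment belongs to the same character or to the
same translator. The sketch below is a simplification for illustration, not
the rank-based procedure used in the study: the nearest-neighbour criterion,
the letter-frequency tokenization, the form of the chi-square score, and all
names are assumptions.

```python
from collections import Counter


def avg_chi_square(text_a, text_b):
    """Average per-letter chi-square between two texts (lower = closer).

    Assumption: expected counts are pooled counts split by text size.
    """
    counts_a = Counter(ch for ch in text_a.lower() if ch.isalpha())
    counts_b = Counter(ch for ch in text_b.lower() if ch.isalpha())
    n_a, n_b = sum(counts_a.values()), sum(counts_b.values())
    if n_a == 0 or n_b == 0:
        raise ValueError("both texts must contain at least one letter")
    vocabulary = set(counts_a) | set(counts_b)
    total = 0.0
    for letter in vocabulary:
        obs_a, obs_b = counts_a[letter], counts_b[letter]
        exp_a = (obs_a + obs_b) * n_a / (n_a + n_b)
        exp_b = (obs_a + obs_b) * n_b / (n_a + n_b)
        total += (obs_a - exp_a) ** 2 / exp_a + (obs_b - exp_b) ** 2 / exp_b
    return total / len(vocabulary)


def nearest_neighbour_kinds(segments):
    """Tally what kind of segment each segment is closest to.

    `segments` is a list of (translator, character, text) triples.
    """
    kinds = Counter()
    for i, (translator, character, text) in enumerate(segments):
        others = [seg for j, seg in enumerate(segments) if j != i]
        best = min(others, key=lambda seg: avg_chi_square(text, seg[2]))
        same_character = best[1] == character
        same_translator = best[0] == translator
        if same_character and same_translator:
            kinds["same character, same translator"] += 1
        elif same_character:
            kinds["same character, other translator"] += 1
        elif same_translator:
            kinds["other character, same translator"] += 1
        else:
            kinds["other character, other translator"] += 1
    return kinds
```

A high count for "same character, other translator" relative to "other
character, same translator" would correspond to the pattern reported here, in
which a character's segments resemble the parallel segments in the other
translation more than the translator's other writing.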

This result contrasts with that of Rybicki (2006), who found that the two
translations of Sienkiewicz separated cleanly from one another while preserving
individual character idiolects. However, Rybicki makes clear that the two
English translations were done almost one hundred years apart, with the second
translator taking specific steps to bring the language of Sienkiewicz into the
20th century. We are also aware that results from studies of two different
authors are not directly comparable, and we do not seek to draw definite
parallels, merely to reflect on related work in the same sphere.


6 Conclusion 

In this research, character idiolects in translation have been examined. Future
work will involve using different metrics for comparison, comparing different
selections of text from the characters considered, and comparing translations
of different authors by the same translator.


Notes 

1 See Farringdon (1996) for a detailed explanation of the origins of this
technique, including detailed examples of the method's use in a legal setting.

2 For the English versions of the plays, the print versions are collected in
Ibsen, Archer, Aveling, Archer & Archer (1890). Sharp's translations can be
found in Sharp (1911), the collected works of Ibsen in German are to be found
in Ibsen (1898), and the Norwegian collected works are found in Ibsen & Bull
(1957).

3 http://www.ibsen.net, last verified January 6, 2015, contains comprehensive
information about Ibsen's life and work together with links to his plays in the
original form and in translation.

4 A Java-based tool designed for this purpose. Lynch & Vogel (2007) describes
the creation and benchmarking of this particular program.

5 Hedda Gabler is the other one who springs to mind. Further studies may
incorporate a wider range of plays and characters.

6 It is not fully clear from any forewords to the e-texts when exactly the
translations themselves were first published; however, the forewords do state
that the first performance in English was in 1890, using Archer's translation.
Sharp's translations were first published in 1911, according to
http://www.leicestersecularsociety.org.uk/library_shelf.htm, last verified
January 6, 2015.